Characterizing Statistical Query Learning: Simplified Notions and Proofs

Author

  • Balázs Szörényi
Abstract

The Statistical Query model was introduced in [6] to handle noise in the well-known PAC model. In this model the learner gains information about the target concept by asking for various statistics about it. Characterizing the number of queries required to learn a given concept class under a fixed distribution was already considered in [3] for weak learning; strong learnability was then characterized in [8]. However, the proofs of these results in [3, 10, 8] (and, for strong learnability, even the characterization itself) are rather complex; our main goal is to present a simple approach that works for both problems. Additionally, we strengthen the result on strong learnability by showing that a class is learnable with polynomially many queries iff all consistent algorithms use polynomially many queries, and by showing that proper and improper learning are basically equivalent. As an example, we apply our results to conjunctions under the uniform distribution.
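To make the query mechanism concrete, the following is a minimal sketch of a statistical query oracle for a conjunction under the uniform distribution on {0,1}^n. It is an illustration only, not the paper's construction: the names `make_conjunction` and `sq_oracle`, the tolerance value, and the choice to compute expectations by brute-force enumeration are all assumptions made for this example.

```python
import itertools
import random


def make_conjunction(indices):
    """Hypothetical target concept: the AND of the variables at `indices`."""
    return lambda x: all(x[i] == 1 for i in indices)


def sq_oracle(target, n, tolerance, rng=None):
    """Build an oracle answering statistical queries about `target` under
    the uniform distribution on {0,1}^n.

    Expectations are computed by exact enumeration, so this is only
    feasible for small n.
    """
    rng = rng or random.Random(0)
    points = list(itertools.product((0, 1), repeat=n))

    def answer(query):
        # query(x, label) is expected to return a value in [0, 1].
        exact = sum(query(x, target(x)) for x in points) / len(points)
        # Any value within +/- tolerance of the true expectation is a
        # legal answer; here the perturbation is chosen at random.
        return exact + rng.uniform(-tolerance, tolerance)

    return answer


# The learner never sees labeled examples, only answers to queries.
target = make_conjunction([0, 2])
oracle = sq_oracle(target, n=4, tolerance=0.01)

# Query 1: how often is the target concept true?
estimate = oracle(lambda x, label: float(label))

# Query 2: how often does the target agree with the single variable x[0]?
agreement = oracle(lambda x, label: float(label == (x[0] == 1)))
```

Under these assumptions the true values are 1/4 for the first query and 3/4 for the second, and each answer lies within the stated tolerance of its true value. The learner has to cope with any answer in that range, which is what lets the model tolerate noise.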

Similar resources

Limits on Exact Learning from Membership and Equivalence Queries

In this paper we look at different combinatorial properties of concept classes that give bounds on the query complexity of exact learning. We consider learning from equivalence queries, membership queries and a combination of equivalence and membership queries. We examine and present proofs regarding efficient query-learnability as it relates to the notions of polynomial certificates, approxima...

On the Size of Convex Hulls of Small Sets

We investigate two different notions of "size" which appear naturally in Statistical Learning Theory. We present quantitative estimates on the fat-shattering dimension and on the covering numbers of convex hulls of sets of functions, given the necessary data on the original sets. The proofs we present are relatively simple since they do not require extensive background in convex geometry.

Query-Relevant Summarization using FAQs

This paper introduces a statistical model for query-relevant summarization: succinctly characterizing the relevance of a document to a query. Learning parameter values for the proposed model requires a large collection of summarized documents, which we do not have, but as a proxy, we use a collection of FAQ (frequently-asked question) documents. Taking a learning approach enables a principled, ...

Traceable Sets

We systematically investigate the various possible notions of traceable sets and the relations they bear to each other and to other notions such as diagonally noncomputable sets or complex and autocomplex sets. We review known notions and results that appear in the literature in different contexts, put them into perspective and provide simplified or at least more direct proofs. In addition...

Query expansion and dimensionality reduction: Notions of optimality in Rocchio relevance feedback and latent semantic indexing

Rocchio relevance feedback and latent semantic indexing (LSI) are well-known extensions of the vector space model for information retrieval (IR). This paper analyzes the statistical relationship between these extensions. The analysis focuses on each method’s basis in least-squares optimization. Noting that LSI and Rocchio relevance feedback both alter the vector space model in a way that is in ...

Publication date: 2009